Can continuous speech recognizers handle isolated speech?

Authors

Fil Alleva

Xuedong Huang

Mei-Yuh Hwang

Li Jiang

Abstract

Continuous speech is far more natural and efficient than isolated speech for communication. However, for current state-of-the-art automatic speech recognition systems, isolated speech recognition (ISR) is far more accurate than continuous speech recognition (CSR). It is common practice in the speech research community to build CSR systems using only CS data. However, slowing of the speaking rate is a natural reaction for a user faced with the high error rates of current CSR systems. Ironically, CSR systems typically have a much higher word error rate when speakers slow down since the acoustic models are usually derived exclusively from continuous speech corpora. In this paper, we summarize our efforts to improve the robustness of our speaker-independent CSR system against speaking styles, without suffering a recognition accuracy penalty. In particular, the multi-style trained system described in this paper attains a 7.0% word error rate for a test set consisting of both isolated and continuous speech, in contrast to the 10.9% word error rate achieved by the same system trained only on continuous speech. © 1998 Elsevier Science B.V. All
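To put the two figures quoted in the abstract side by side, the following is a minimal sketch of how word error rate is conventionally computed (word-level edit distance divided by reference length) and how the relative reduction from 10.9% to 7.0% works out. The function names `word_error_rate` and `relative_reduction` are illustrative assumptions, not taken from the paper, and the paper's own scoring procedure may differ in detail.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words.

    Assumption: whitespace tokenization and unit cost for substitutions,
    insertions, and deletions. Real scoring tools may normalize text first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error rate removed by the improved system."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Figures from the abstract: 10.9% (continuous-only training)
    # versus 7.0% (multi-style training).
    print(f"relative reduction: {relative_reduction(10.9, 7.0):.1%}")
```

Under these assumptions, the drop from 10.9% to 7.0% corresponds to removing roughly a third of the baseline errors on the mixed isolated and continuous test set.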

Similar resources

Discriminative feature weighting for HMM-based continuous speech recognizers

The Discriminative Feature Extraction (DFE) method provides an appropriate formalism for the design of the front-end feature extraction module in pattern classification systems. In recent years, this formalism has been successfully applied to different speech recognition problems, such as classification of vowels, classification of phonemes, or isolated word recognition. The DFE formalism can be...


Recognition of Prosodic Factors and Detection of Landmarks for Improvements to Continuous Speech Recognition Systems

This thesis examines the usefulness of including prosodic and phonetic context information in the phoneme model of a speech recognizer. This is done by creating a series of prosodic and phonetic models and then comparing the log likelihoods of each model. The comparison of log likelihoods shows that both prosodic and phonetic context information improve the phoneme model for most phonemes. The pro...


Automatic Generation of Pronunciation Dictionaries

In this report we describe a data-driven approach for creating pronunciation dictionaries for a new, unseen target language by voting among phoneme recognizers in nine different languages other than the target language. In this process, recordings of the new language that are transcribed at the word level are decoded by the phoneme recognizers. This results in a hypothesis of nine phonemes per t...


Dimensionality reduction of the enhanced feature set for the HMM-based speech recognizer

In the past few years, a great deal of research has been directed toward finding acoustic features that are effective for automatic speech recognition. Until recently, most speech recognizers used about 12 cepstral coefficients derived through linear prediction analysis as recognition features [1]. In [2,3], Furui investigated the use of temporal derivatives of cepstral coefficients...


A syllable based continuous speech recognizer for Tamil

This paper presents a novel technique for building a syllable based continuous speech recognizer when unannotated transcribed train data is available. We present two different segmentation algorithms to segment the speech and the corresponding text into comparable syllable like units. A group delay based two level segmentation algorithm is proposed to extract accurate syllable units from the sp...


Journal:

Speech Communication

Volume 26

Publication date: 1997
